1. Introduction

Two-line elements are widely used for space operations to predict the orbit with a moderate accuracy for 2-3 days. Local optimization methods, such as the nonlinear least squares method with differential corrections, can estimate a TLE as long as there exists an initial estimate that provides the desired precision. Global optimization methods to estimate TLEs are computationally intensive, and estimating a large number of them is prohibitive. In this paper, the feasibility of estimating TLEs using machine learning methods is investigated. First, a Monte-Carlo approach to estimate a TLE, when there are no initial estimates that provide the desired precision, is introduced. The proposed Monte-Carlo method is shown to estimate TLEs with root mean square errors below 1 km for space objects with varying area-to-mass ratios and orbital characteristics. Second, gradient boosting decision trees and fully-connected neural networks are trained to map orbital evolution of space objects to the associated TLEs using 8 million publicly available TLEs from the US space catalog. The desired precision in the mapping to estimate a TLE is achieved for one of the three test cases, which is a low area-to-mass ratio space object.

2. Background

2.1 Two-Line Elements

Two-line element sets (TLEs) and Simplified General Perturbations #4 (SGP4) are widely used for space operations to predict trajectories of space objects with moderate accuracy (tens of kilometers) for 2-3 days. SGP4 provides the reference trajectory that is updated by fitting the actual trajectory, and TLEs are the updated parameters that approximate it. SGP4 includes zonal terms up to J5 of the gravitational potential of the Earth, neutral atmosphere with exponential decay and partially modeled third-body mass interactions. Bstar ($B^{*}$) is an element of a TLE that determines the effect of air drag on the trajectories of the space objects. Table 1 shows the parameters of a TLE. However, it should be noted that TLE parameters are mean elements that are estimated by fitting the SGP4 trajectory and observed trajectory, and their values can drastically changed compared to the osculating elements at epoch.

Parameters	Equations	Description
$B^{*}$	$B^{*}$=$\frac{1}{2}\frac{C_{D}A}{m}\rho$	Determines the effect of air drag on the trajectories
$n$	$n$=$\sqrt{\frac{\mu}{a^{3}}}$	Defines the angular frequency of the orbit (shape of the orbit)
$i$	-	Defines the orientation of the orbital plane with respect to the Earth
$\Omega$	-	Defines the orientationof an orbit with respect to the $\hat{z}$ axis, which is parallel to the rotation axis of the Earth
$\omega$	-	Defines the orientation of the orbital plane with respect to the Earth around the axis parallel to the angular momentum vector of the orbit
$e$	-	Defines the shape of the orbit
$m$	-	Defines the location of a space object in its orbit

Table 1 : Parameters of a two-line element set.

2.2 Two-Line Element Estimation Methods

There are a few studies that investigate different approaches to estimate TLEs. As stated above, TLEs include the mean states estimated by fitting observations to the dynamics provided by SGP4, and they can only be used with SGP4 [1]. Keplerian orbital elements are used as initial estimates of TLEs by the differential corrections and nonlinear least squares methods [2]. Kalman filter is investigated to estimate TLEs using onboard Global Positioning System (GPS) data [3]. However, the above methods, which search for the local minimum of the objective function, which is the sum of the squares of the position and velocity errors, depend on the availability of a reasonable initial estimate of a TLE. Although methods, such as genetic algorithms [4] and invasive weed optimization [5], that do not require an initial estimate of the TLEs have been investigated, they are reported to be computationally intensive as they search for the global optimum. To summarize, the two main approaches in the literature to estimate a TLE use either computationally expensive global search or local search which depends on having an initial estimate. This represents a significant shortcoming in the current state-of-the-art of orbit determination using TLEs. The present work addresses this shortcoming.

2.3 Machine Learning

This work utilizes machine learning methods, namely the gradient boosting trees and fully-connected neural networks, to approximate the inverse mapping of publicly available SGP4 algorithm [1] for LEO objects by learning to map the orbital evolution to the associated TLE. The capability of approximating such mapping using machine learning will enable to represent time series orbital data in latent space that can be used with orbit propagators. The gradient boosting trees are machine learning models that are suitable for determining non-linear and sharp decision boundaries. The boosting is achieved by adding weak learners, such as decision trees, to determine the complex decision boundaries. The first successful application of such an idea is adaptive boosting (AdaBoost). AdaBoost generates a sequence of classifiers, and each classifier puts more weight on the samples that are not classified accurately in the previous iteration. Such an approach enables each classifier to capture different features in the input data, and more complex decision boundaries can be defined in the end. The boosting idea is further improved by defining the framework as a numerical optimization in the function space. The objective function is minimized by adding weak learners using the steepest-descent method.

Artificial neural networks are machine learning models that are suitable for determining smooth nonlinear decision boundaries. According to the universal approximation theory [6], any Borel measurable function can be approximated by a feedforward neural network as long as it includes enough hidden units [7]. In addition, a neural network may also approximate any discrete function irrespective of its dimensions given that dimension is finite. Although the universal approximation theory states that any closed and bounded continuous function can be approximated by multilayer perceptrons, the training scheme to approximate that function might fail due to overfitting or failure of the optimization algorithm [7].

3. Methods

3.1 Monte-Carlo approach

A Monte-Carlo approach that searches for the global minimum of the sum of the squares of the position and velocity errors to estimate a TLE without any initial data is presented. However, the proposed method is computationally intensive. Therefore, this work also investigates the feasibility of using machine learning to predict reasonable initial estimates of TLEs that can be utilized by the proposed Monte-Carlo approach. Figure 1 outlines the proposed method that extends the differential corrections method using the Monte-Carlo technique to estimate a TLE that satisfies a given criterion. The following procedure describes how to obtain such TLEs.

Figure 1:The Monte-Carlo approach to estimate a TLE.

The inner loop leverages the differential corrections to correct the initial parameters to fit the orbit using non-linear least squares method. The differential corrections scheme iteratively improves the sampled osculating TLE to match the position vector at the epoch time. The stopping criterion is selected as $10^{-8}$ and smaller for the magnitude of position error in kilometers. Equation below shows how a Jacobian matrix, which is computed by the finite-difference method, relates the change in the TLE to the change in the position and velocity vectors at the epoch.

\begin{equation} \label{eq:diffCorrection} \begin{split} \delta tle_{epoch} &= \left(\frac{\partial f_{sgp4}}{\partial tle_{epoch}}\right)^{-1} \delta rv_{epoch}\\ &= (A_{t_{0}})^{-1} \delta b_{t_{0}} \end{split} \end{equation}

where $tle_{epoch}$ is the TLE parameters at the epoch, $f_{sgp4}$ is SGP4, $rv_{epoch}$ are the propagated position and velocity vectors at the epoch, $A_{t_{0}}$ is the Jacobian, and $\delta b_{t_{0}}$ is the residual at the epoch. Equation below shows the normal equation that relates all of the changes in the position and velocity vectors associated with the observational data available to the change in the TLE at epoch time [2]. In this work, all observational data are assumed to have the same weight, therefore, $W$ is assumed to be the identity matrix. Equation above is used for differential corrections in the inner loop and Equation below is used for differential corrections with nonlinear least squares method in the outer loop (Figure 1).

\begin{equation} \label{eq:diffCorrectionNonlinear} \begin{split} \delta tle_{epoch} &= (A^{T}WA)^{-1}A^{T}Wb\\ &= \left(\sum_{i}^{N} A_{(t_{i})}^{T}A_{(t_{i})}\right)^{-1} \sum_{i}^{N} A_{(t_{i})}^{T}b_{(t_{i})} \end{split} \end{equation}

\noindent where $tle_{epoch}$ is the TLE parameters at the epoch, $f_{sgp4}$ is SGP4, $rv_{epoch}$ are the propagated position and velocity vectors at any time, $A_{t_{i}}$ is the Jacobian, and $\delta b_{t_{i}}$ is the residual at any time.

3.2 Machine Learning Assisted Estimation

3.2.1 Feature Selection

In the present work, Equinoctial orbital elements and a parameter that we shall call instance (Table 2) that defines the sequence of the data with fixed time interval are selected as feature vectors (input data for the machine learning models). The above orbital elements are non-singular, and the magnitude of their values are bounded, thus well-suited for the training process. The Keplerian mean motion, in radians per hour, replaces semi-major axis of the Equinoctial elements. The feature vectors are generated from 8,206,374 TLEs which are all LEO objects that have semi-major axes of 8,378 km and smaller in the space catalog. TLEs are restricted to the LEO orbital regime to ensure that all space objects are subject to a significant orbital perturbation due to atmospheric drag.

Figure 2: Feature vectors selected for the machine learning models.

Parameters	Equations
Instance	[2.452, 2.402, ..., 0.049, 0.0]
Mean motion (rad/h)	$n = \sqrt{\frac{\mu}{a^{3}}}$
Ex	$E_{x} = ecos(\omega+\Omega)$
Ey	$E_{y} = esin(\omega+\Omega)$
Hx	$H_{x} = tan(\frac{i}{2})cos(\Omega)$
Hy	$H_{y} = tan(\frac{i}{2})sin(\Omega)$
Lv	$Lv = \nu + \omega + \Omega$

Table 2 : Feature vectors selected for the machine learning models.

3.2.2 Boosting Trees

An optimized distributed gradient boosting library called XGBoost (version 0.7) is used for this work (Available at https://github.com/dmlc/xgboost). A gradient boosting tree model is trained for each element of a TLE, namely Bstar ($B^{*}$), eccentricity ($e$), inclination ($i$), mean anomaly ($m$), mean motion ($n$), right ascension of the ascending node ($\Omega$), and the argument of perigee ($\omega$). The construction of individual gradient boosting trees for each TLE parameter is chosen because they cannot map to a multidimensional output for regression problems due to the architecture of the decision trees, whereas neural networks can.

Table 3 shows the gradient boosting tree hyperparameters that are optimized for training XGBoost models. The learning rate controls the step size for the gradient-based optimizer. The number of estimators controls the number of trees. The maximum depth limits the depth of each tree which is used to increase the complexity of the model. The minimum child weight is required to avoid overfitting of the training data. A node is split as long as the resultant split leads to a positive reduction in the loss function (mean square error in this work). Gamma can control the split of nodes. The rest of the parameters in Table 3 are used to avoid overfitting by forcing the model to be less complex. The hyperparameters of the gradient boosting trees are tuned empirically by considering the bias-variance tradeoff to ensure the trained models have generalization capability (Table 4). The grid search methods are not preferred because the models are trained with large amount of data.

Parameters	$B^{*}$	$i$	$\Omega$	$e$	$\omega$	$m$	$n$
Learning rate	0.1	0.3	0.3	0.3	0.2	0.2	0.1
No. of estimators	100	80	70	70	80	80	100
Maximum depth	20	20	20	20	25	25	25
Min. child weight	7	9	11	13	27	27	1
Gamma	-	-	-	-	0.7	1.0	-
Subsample	0.7	0.7	0.6	0.7	0.1	0.2	1.0
Col. sample by tree	0.7	0.8	0.7	0.7	0.2	0.3	1.0
Col. sample by level	0.7	0.8	0.7	0.7	0.2	0.3	1.0
Alpha	0.2	0.2	0.2	0.1	0.9	0.7	0.0

Table 3 : Hyperparameters of XGBoost.

Models	Training Squared Mean Error	Test Squared Mean Error
$B^{*}$	3.9e-5	6.3e-5
$i$	7.1e-6	2.1e-5
$\Omega$	3.5e-4	2.9e-3
$e$	1.6e-6	1.3e-5
$\omega$	0.21	0.32
$m$	0.15	0.26
$n$	9.8e-9	2.2e-8

Table 4 : Bias-Variance trade-off for XGBoost.

3.2.3 Deep Learning

Since neural networks can be trained with additional data without being trained from scratch, they are more versatile compared to other methods such as decision trees. In this paper, a fully-connected neural network architecture (Keras-version 2.2.0 using Tensorflow backend-version 1.8.0) is chosen to approximate the TLE estimation scheme (Figure 3). It should be noted that the inputs are standardized.

Figure 3: Fully-connected neural network architecture used for the work.

Table 5 shows the hyperparameters of the fully-connected neural network that are optimized for neural network models. The Adam optimizer is chosen due to its computational efficiency for problems with large amount of data, and it computes the moving average of the gradients and squared gradients. The parameters of the Adam optimizer that control the decay rate of the moving averages {KA2015} are $\beta_{1}$ and $\beta_{2}$. The learning rate controls the step size for the Adam optimizer while the decay parameter decays the learning rate at each epoch. The hyperparameters of the fully-connected neural networks are tuned empirically by considering the bias-variance tradeoff to ensure the trained models have generalization capability (Table 6). The grid search methods are not preferred because the models are trained with large amount of data.

Parameters	$B^{*}$	$i$	$\Omega$	$e$	$\omega$	$m$	$n$
Learning rate	9e-5	9e-5	9e-5	9e-5	9e-5	9e-5	1e-5
Optimizer	Adam	Adam	Adam	Adam	Adam	Adam	Adam
$\beta_{1}$	9e-4	9e-4	9e-4	9e-4	9e-4	9e-4	9e-4
$\beta_{2}$	999e-6	999e-6	999e-6	999e-6	999e-6	999e-6	999e-6
Decay	1e-07	1e-07	1e-07	1e-07	1e-07	1e-07	1e-07
Epoch number	50	50	50	50	50	50	50

Table 5 : Hyperparameters of FC network.

Models	Training Squared Mean Error	Test Squared Mean Error
$B^{*}$	1.4e-6	2.4e-6
$i$	1.3e-4	9.8e-5
$\Omega$	0.02	0.01
$e$	1.8e-7	2e-7
$\omega$	0.24	0.22
$m$	0.2	0.3
$n$	4.6e-9	5.3e-9

Table 6 : Bias-Variance trade-off for the FC network.

4. Results

4.1 Monte-Carlo Approach

A novel approach to estimate a TLE from a given ephemeris without any initial estimate of the TLE using a Monte-Carlo method is introduced. Three particular cases with varying area-to-mass ratios, orbital characteristics and solar activity phases, where the standard differential corrections with the nonlinear least squares method {VC2008} cannot generate a valid TLE, are chosen as test cases. Initial orbital parameters for the test cases are propagated forward in time using a three-degree-of-freedom (3-DOF) numerical orbit propagator over 1 day. The orbital evolution is defined in the Vehicle Velocity, Local Horizontal (VVLH) reference frame (Figure 4).The perturbations included are: air drag (Harris-Priester model), the Earth's aspherical gravitational potential with order and degree 20 (Holmes-Featherstone model), solar radiation pressure (Lambertian sphere model), and Sun and Moon third-body gravitational perturbations. Parameters of the test cases used for the numerical propagation are given in Table 7.

Figure 4: The Vehicle Velocity, Local Horizontal (VVLH) reference frame.

Parameters	Test case #1	Test case #2	Test case #3
Epoch (UTC)	2018-12-12T18:47	2014-01-15T10:37	2009-08-03T15:18
Period (min)	$99.71$	$94.65$	$97.32$
Mass (kg)	$800$	$3$	$1$
Area ($m^{2}$)	$6$	$0.03$	$0.1$
$n$ ($\frac{rad}{s}$)	$0.00105$	$0.001106$	$0.001076$
$i$ (deg)	$101.45$	$85.12$	$97.60$
$\Omega$ (deg)	$53.03$	$90.16$	$253.29$
$\omega$ (deg)	$9.33$	$204.08$	$235.32$
$m$ (deg)	$359.44$	$196.9$	$178.81$
$e$	$0.022353$	$0.011556$	$0.025822$
$C_{D}$	$2.2$	$1.5$	$0.8$

Table 7 : The parameters of the test cases used for the numerical propagator.

Figure 4 shows the absolute relative distance between the numerically propagated orbit and the SGP4propagated orbit when the inner loop is satisfied for the test cases.

Figure 4: the absolute relative distances with respect to numerically propagated orbit when Keplerian elements are used as TLEs.

Figure 5 shows the absolute relative distance between the numerically propagated orbit and the SGP4propagated orbit when the inner loop is satisfied for the test cases.

Figure 5: Absolute relative distance between the numerically propagated orbit and the SGP4propagated orbit when the inner loop is satisfied.

Figure 6 shows the absolute relative distance with respect to numerically propagated orbit when Kep-lerian elements are used as TLE for the test cases.

Figure 6: Absolute relative distance with respect to numerically propagated orbit when Kep-lerian elements are used as TLE.

In conclusion, the present Monte-Carlo method can estimate a TLE without any initial estimate. Such capability is beneficial to space situational awareness (SSA) because TLEs are required for establishing first contact with the spacecraft during the Launch and Early Orbit Phase (LEOP). Most ground stations have propriety software that requires TLEs to track the satellites. Although the proposed method addresses the difficulty related to unavailability of the initial estimate, the inner loop is iterated 50 times on the average to find an initial estimate with desired precision. Therefore, the feasibility of using machine learning methods to generate initial estimates with the desired precision is investigated.

4.2 Machine Learning Assisted TLE Estimation

The performances of machine learning methods differ based on the features selected. Two different machine learning models are trained to predict each mean elements of a TLE, and the best performing model is chosen to be validated by the errors in the orbital evolution. The error in the orbital evolution is computed by propagating TLEs estimated by the selected best machine learning models and the associated CSpOC TLEs backward in time using SGP4 for 50 orbital periods. Table 8 summarizes the best performing models for each mean element that are discussed in detail above.

Mean elements	Best machine learning models
$B^{*}$	Gradient boosting trees
$n$	Fully-connected neural network (unless the absolute residual between two models are larger than 0.0002 rad/min for values of $n$ between 0.053 rad/min and 0.067 rad/min)
$i$	Gradient boosting trees (unless the absolute residual between two models are larger than 0.014 rad)
$\Omega$	Gradient boosting trees
$\omega$	Gradient boosting trees
$m$	Fully-connected neural network
$e$	Fully-connected neural network

Table 8 : Best machine learning models for predicting mean elements of a TLE.

Figure 7 shows the absolute relative distances with respect to numerically propagated orbits when TLEs estimated by machine learning models are used for the test cases.

Figure 7: Absolute relative distances with respect to numerically propagated orbits when TLEs estimated by machine learning models are used.

Figure 8 shows the absolute relative distances between the numerically propagated orbits and the SGP4 propagated orbits when the initial estimates that are computed by the machine learning models are improved by nonlinear least squares with differential corrections method for the test cases.

Figure 8: Absolute relative distances between the numerically propagated orbits and the SGP4 propagated orbits when the initial estimates that are computed by the machine learning models are improved by nonlinear least squares with differential corrections method.

In conclusion, the present best machine learning models (selected based ontheir performance) that are trained with TLEs obtained from the official US Space Catalog have the potential to provide an initial estimate to estimate a TLE with desired precision. The inner loop, which searches for the global minimum by using brute force approach, in the Monte-Carlo TLE estimation method can be replaced by machine learning models because there isno secular error growth in the orbital evolution of TLEs predicted by machine learning models (test case #1). Therefore, the standard differential corrections with nonlinear least squares method can converge to a local minimum, and the computation time can be reduced significantly.

5. References

[1] Vallado, D., Crawford, P., Hujsak, R. and Kelso, T.S., 2006, August. Revisiting spacetrack report# 3. In AIAA/AAS Astrodynamics Specialist Conference and Exhibit (p. 6753).

[2] Vallado, D. and Crawford, P., 2008, August. SGP4 orbit determination. In AIAA/AAS Astrodynamics Specialist Conference and Exhibit (p. 6770).

[3] Montenbruck, O. and Gill, E., 2000, June. Real-Time estimation of SGP4 orbital elements from GPS navigation data. In International Symposium Space Flight Dynamics (pp. 26-30). Biarritz, France.

[4] Goh, S.T. and Low, K.S., 2018, March. Real-time estimation of satellite's two-line elements via positioning data. In 2018 IEEE Aerospace Conference (pp. 1-7). IEEE.

[5] Bolandi, H., Ashtari Larki, M.H., Sedighy, S.H., Zeighami, M.S. and Esmailzadeh, M., 2015. Estimation of Simplified General Perturbations model 4 orbital elements from global positioning system data by invasive weed optimization algorithm. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 229(8), pp.1384-1394.

[6] Cybenko, G., 1989. Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems, 2(4), pp.303-314.

[7] Goodfellow, I., Bengio, Y. and Courville, A., 2016. Deep learning. MIT press.

[8] Kingma, D.P. and Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

6. Code

Since the codes used to generate this research includes different components, codes are not shared in the blog post. Please contact me for the source code.